perf: Use OpenCV over PIL for PNG encoding in ImageRef.from_pil#562
maxdswain wants to merge 1 commit into docling-project:main
Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>
✅ DCO Check Passed
Thanks @maxdswain, all your commits are properly signed off. 🎉

Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit
Wonderful, this rule succeeded.
Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
@maxdswain it would definitely be welcome to have this performance bottleneck addressed. However, opencv-python adds some intricacies, since it exists in both flavours:
In order to support this cleanly, and knowing that other optional third-party dependencies in docling, such as OCR engines, partly favour one flavour and partly the other, we would have to:
Even so, it would not yet be a complete solution, since every dependent of
Would you like to take these adjustments into account?
Overview
The `ImageRef.from_pil` class method is used widely in docling's codebase. It is often called several times per page when parsing documents in the `docling_parse.pdf_parser.PdfDocument._to_bitmap_resources_from_decoder` method. From my profiling, I found that it took up ~45% of processing time when doing a `DocumentConverter` conversion with all AI models disabled. This led me to look into how its performance can be improved.

The function uses Pillow to encode the image to a PNG, which is notoriously slow. So I swapped it out for OpenCV, improving the performance of this function by ~55% on a simple test case.
When using these changes in the main docling repo, it reduced my conversion time from 14.2 to 9.21 (~35%) with all AI models disabled.
One caveat is that I did add an extra dependency, `opencv-python-headless`; however, this is already a dependency in the main docling repo's `uv.lock`.